6. Data Management Considerations

Methods presented in Section 5: Statistical Tests and Methods vary in complexity from relatively straightforward graphical methods to complex matrix-based procedures, such as kriging, a weighted moving-average technique that interpolates the data distribution by calculating an area mean at nodes of a grid (Gilbert 1987). Correspondingly, the software packages listed in Appendix D range in capabilities from specialized spreadsheet-based groundwater calculators to comprehensive high-powered statistical software suites that are not industry specific. Most of these packages will accept input data from spreadsheets or text files, and many commercial packages are able to connect directly to user databases. Regardless of the system used, input data files should always be provided with statistical analysis deliverables (in electronic format) to allow for verification and cross-checking with different models, as appropriate.

Data management strategies will vary depending on the amount and type of data collected using a systematic planning process, as presented in Section 3. For example, a small dry cleaner site may conduct trend analysis on source wells, boundary wells, or both to evaluate concentration changes over time for post-injection monitoring of an in situ bioremediation remedy. Tracking groundwater monitoring data using spreadsheet software may be sufficient for a project of this nature.

For large, complex, multi-source CERCLA (Comprehensive Environmental Response, Compensation, and Liability Act) sites where there are numerous contaminants and separate monitoring systems, a more sophisticated statistical approach may be warranted. With a large data set, preparing groundwater data for statistical analysis can be more time-consuming than performing the analysis itself. For these sites, it can be more cost-effective to invest in a more robust data management solution. Commercial environmental data management software is available for this purpose. Comprehensive enterprise-level products developed under direction of the Department of Defense include:

Most labs now deliver analytical results electronically, and several state and federal organizations have established specific electronic data deliverable (EDD) format requirements. USEPA has developed the staged electronic data deliverable (SEDD) format to support uniform delivery, review, storage, and retrieval of laboratory data (USEPA 2011a).

However, site data management may be complicated by turnover in site project managers and regulators over the life of a remediation project. Historical data may only be available as hard copy tables, presentation-level crosstab spreadsheets, or in other formats. Cleanup and conversion of legacy data can be very time- and labor-intensive, so users must balance the level of effort needed to convert data to a usable form against the value of the data to the statistical approach. See Section 3.3.2: Historical Data for additional discussion on the usefulness of historical data for statistical evaluation. If the data set is small, it may be fastest to hand-enter the data needed for analysis. Information regarding methods for automated data conversion and cleanup, such as scanning and optical character recognition, is available online.
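As an illustration of converting a presentation-level crosstab into rows suitable for analysis, the following is a minimal sketch using only the Python standard library. The column names, well identifiers, and concentration values are made up for illustration, and the function name `flatten_crosstab` is hypothetical. The sketch assumes that a blank cell means "not analyzed" and skips it; a real conversion would need to confirm what blank cells mean in the source table.

```python
import csv
import io

# Hypothetical crosstab export: one row per well and date, one column per
# contaminant. All values are invented for illustration only.
CROSSTAB = """Well Number,Sample Date,PCE,TCE
MW-1,2013-03-01,12,3.4
MW-1,2013-06-01,9.5,
MW-2,2013-03-01,150,22
"""

def flatten_crosstab(text, id_fields=("Well Number", "Sample Date")):
    """Yield one record per well, date, and contaminant.

    Blank cells are skipped (assumed to mean the analyte was not analyzed).
    """
    reader = csv.DictReader(io.StringIO(text))
    for row in reader:
        for field, value in row.items():
            # Skip identifier columns, overflow columns, and blank cells.
            if field is None or field in id_fields:
                continue
            if value is None or value.strip() == "":
                continue
            record = {key: row[key] for key in id_fields}
            record["Contaminant"] = field
            record["Concentration"] = float(value)
            yield record

rows = list(flatten_crosstab(CROSSTAB))
for record in rows:
    print(record)
```

Each output record holds a single well, date, and contaminant, so it can be appended to a results table without further reshaping.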

Good Practices for Managing Groundwater Monitoring Data

The general “good practices” listed below will help streamline data analysis and provide a basic structure listing for well construction, analytical results, field data, and geographical coordinates. This structure can be expanded upon as additional data needs are identified. The information presented here is intended as a starting point; you should determine the database formats and information requirements for each project. For more comprehensive data systems, users should follow established data standards such as USEPA’s SEDD format referenced above.

  1. Provide well construction data in a single row for each well/screen interval:

    Well Number | Well Diameter | Total Depth | Top Of Screen | Length Of Screen | Top Of Casing Elevation | Reference Datum

    • Total depth measurements are typically entered as a positive depth value qualified as below ground surface or BGS.
    • For sites with complex geology, parameters such as depth to first water after drilling, depth of drilling fluid circulation loss or other relevant measurements may also be tracked.
  2. Provide analytical results and groundwater elevations in a “flattened” format, in which each row of data contains data collected from a single well/screen interval for a single contaminant on a single date. Tabulated analytical results should also include lab analysis qualifiers (such as I, J, and U), practical quantitation limits, and method detection limits to allow flexibility in identifying and managing nondetect values and potential outliers.

    Well Number | Sample Date | Contaminant A | Concentration A | Lab Qualifier A | PQL A | MDL A | Preparation Method | Analytical Method

    • Contaminant listings should include a field for Chemical Abstract Service Registry Number (CASRN) or other standardized designation since many chemicals may be identified under multiple names. For example, tetrachloroethene is also known as perchloroethene, perchloroethylene, Perc, and PCE.
    • All numeric results should be formatted to a predetermined precision (number of decimal places). For most contaminants whole numbers are adequate; however, for a few contaminants a single decimal place can be the difference between leaving a site as is and remediating it. If the precision is not set before data collection, columns could be incorrectly formatted and values set to "0" by accident.
  3. Present analytical measurements in three columns:
    • One column is for the quantified value for that sample, or a reporting limit if the sample is nondetect.
    • Another column is for a flag (possibly numeric, such as 1 for detected and 0 for nondetect) signifying the status of that sample (such as detected, trace, or nondetect). Standardized lab qualifiers also serve this purpose and can be stored in this column.
    • A third column is for the units of the measurement, consistent with the risk assessment or criteria, such as μg/L or mg/L. Use of “parts per million” is not acceptable for groundwater evaluations.

      Result | Status | Units
      5 | 0 | mg/L

      Use this format rather than, for example, “<5”, “5J”, or a similar notation in the result column because most software will not function properly when numeric values are combined with text or symbols in the same column.
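For legacy tables that already combine values and symbols, the separation described above might be sketched as follows. This is a minimal illustration, not a definitive parser: the function name `split_result` is hypothetical, the pattern handles only entries shaped like "<5", "5J", or "12.3", and it assumes that a leading "<" or a "U" qualifier marks a nondetect reported at its reporting limit. Project-specific qualifier definitions should take precedence over these assumptions.

```python
import re

# Matches an optional "<", a number, and an optional trailing qualifier made
# of letters, for example "<5", "5J", "0.5 U", or "12.3".
LEGACY = re.compile(r"^\s*(<)?\s*(\d+(?:\.\d+)?)\s*([A-Za-z]*)\s*$")

def split_result(raw, units):
    """Split a combined legacy entry into separate result, status, and units.

    Status is 1 for detected and 0 for nondetect. Treating "<" and "U" as
    nondetect is an assumption of this sketch.
    """
    match = LEGACY.match(raw)
    if match is None:
        # Anything unexpected (such as "ND") is flagged for manual review.
        raise ValueError(f"unrecognized result entry: {raw!r}")
    less_than, number, qualifier = match.groups()
    qualifier = qualifier.upper()
    nondetect = less_than is not None or qualifier == "U"
    return {
        "Result": float(number),
        "Status": 0 if nondetect else 1,
        "Qualifier": qualifier,
        "Units": units,
    }

for entry in ["<5", "5J", "0.5 U", "12.3"]:
    print(entry, "->", split_result(entry, "mg/L"))
```

Raising an error on unrecognized entries, rather than guessing, keeps ambiguous legacy values from silently entering the numeric column.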

4. Tabulate field sampling results by single well/screen interval and date to verify that samples come from the same target population.

Well Number | Sample Date | Static Depth to Water | pH | Temperature | Conductivity | Dissolved Oxygen | Turbidity | ORP

 

5. While geospatial analysis is beyond the scope of this guidance, consistently collect and manage geographical coordinates and well survey elevations to simplify groundwater data analysis.

Well Number | Sample Date | Latitude (Decimal Degree or Degree, Minute, Second) | Longitude (Decimal Degree or Degree, Minute, Second) | Collection Method | Datum | Verification Method

6. Provide source references (such as lab reports and field notes) for all data stored in the system to verify integrity.

7. Always back up your data.
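The well construction and analytical result layouts described in the practices above could be sketched as two related tables. The example below uses Python's built-in `sqlite3` module with an in-memory database. The table names, column names, key choices, and sample row are all assumptions made for illustration; in particular, the primary key assumes one result per well/screen interval, date, and CASRN, so field duplicates would need an additional key column.

```python
import sqlite3

# Hypothetical schema; names and types are illustrative assumptions.
SCHEMA = """
CREATE TABLE wells (
    well_number             TEXT PRIMARY KEY,
    well_diameter           REAL,
    total_depth             REAL,   -- positive depth below ground surface
    top_of_screen           REAL,
    length_of_screen        REAL,
    top_of_casing_elevation REAL,
    reference_datum         TEXT
);
CREATE TABLE results (
    well_number      TEXT NOT NULL REFERENCES wells(well_number),
    sample_date      TEXT NOT NULL, -- ISO 8601 date string
    casrn            TEXT NOT NULL, -- standardized chemical identifier
    contaminant      TEXT,
    result           REAL,          -- value, or reporting limit if nondetect
    status           INTEGER,       -- 1 detected, 0 nondetect
    lab_qualifier    TEXT,
    units            TEXT,
    pql              REAL,
    mdl              REAL,
    source_reference TEXT,          -- lab report or field note citation
    PRIMARY KEY (well_number, sample_date, casrn)
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce the well reference
conn.executescript(SCHEMA)

# Made-up example rows, for illustration only.
conn.execute(
    "INSERT INTO wells (well_number, total_depth, reference_datum) "
    "VALUES (?, ?, ?)",
    ("MW-1", 45.0, "NAVD88"),
)
conn.execute(
    "INSERT INTO results (well_number, sample_date, casrn, contaminant, "
    "result, status, lab_qualifier, units, source_reference) "
    "VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
    ("MW-1", "2013-03-01", "127-18-4", "tetrachloroethene",
     5.0, 0, "U", "ug/L", "hypothetical lab report"),
)
conn.commit()

row = conn.execute(
    "SELECT result, status, units FROM results WHERE well_number = ?",
    ("MW-1",),
).fetchone()
print(row)
```

Keying results on CASRN rather than chemical name avoids duplicate entries when the same chemical appears under several names, and the foreign key rejects results for wells that have no construction record.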

Publication Date: December 2013

Permission is granted to refer to or quote from this publication with the customary acknowledgment of the source (see suggested citation and disclaimer).

 

This web site is owned by ITRC.
